First Steps towards the Semi-automatic Development of a Wordformation-based Lexicon of Latin

Authors

  • Marco Passarotti
  • Francesco Mambrini
Abstract

Although Latin lexicography has a long tradition dating back to the ancient grammarians, and almost all Latin grammars devote at least part of their morphology section(s) to wordformation, none of the currently available lexical resources and NLP tools for Latin features a wordformation-based organization of the Latin lexicon. In this paper, we describe the first steps towards the semi-automatic development of a wordformation-based lexicon of Latin, detailing several problems that arose while building the lexicon and presenting our solutions. Developing such a lexicon is now of the utmost importance, as recent years have seen a large growth in annotated corpora of Latin texts from different eras. While these corpora include lemmatization, morphological tagging and syntactic analysis, none of them features segmentation of word forms or wordformation relations between lexemes. This limits the browsing and exploitation of the annotated data for linguistic research and for NLP tasks such as information retrieval and heuristics for PoS tagging of unknown words.


Similar resources

Pars Morph: A Persian Morphological Analyzer

In this paper, the theoretical foundation, implementation and uses of Pars Morph, a Persian morphological analyzer, are introduced. Pars Morph is a rule-based Persian morphological analysis system, which analyzes the internal structure of words in Persian and determines the grammatical category and function of the word parts. Pars Morph being in link with a lexicon covering about 45...


Object-Oriented Method for Automatic Extraction of Road from High Resolution Satellite Images

As the information carried in a high spatial resolution image is not represented by single pixels but by meaningful image objects, which include the association of multiple pixels and their mutual relations, the object based method has become one of the most commonly used strategies for the processing of high resolution imagery. This processing comprises two fundamental and critical steps towar...


Automatic Interpretation of UltraCam Imagery by Combination of Support Vector Machine and Knowledge-based Systems

With the development of digital sensors, an increasing number of high-resolution images are available. Manual interpretation of these images is not feasible, which calls for practical, fast and automatic solutions to environmental and location-based management problems. The land cover classification using high-resolution imagery is a difficult process because of the c...


Formatio formosa est. Building a Word Formation Lexicon for Latin

English. This paper presents the steps undertaken for building a word formation lexicon for Latin. The types of word formation rules are discussed and the semiautomatic procedure to pair their input and output lexical items is evaluated. An on-line graphical query system to access the lexicon is described as well. Italiano. Questo articolo presenta le procedure di realizzazione di un lessico mo...


Attitudes towards English as an International Language (EIL) in Iran: Development and Validation of a New Model and Questionnaire

This study aimed at developing and validating a new model and instrument to explore attitudes of Iranian EFL learners towards English as an International Language (EIL). In so doing, the researchers followed several rigorous steps including extensive literature review, content selection, item generation, designing the rating scales and personal information part, Delphi technique, item revision,...




Publication date: 2012